EY GDS | Azure Data Engineer Interview Experience | 3 YoE



Round 1: Technical Interview (30 Minutes)

📍 Walkthrough of the candidate's past projects, highlighting challenges and solutions.

📍 Questions related to the architecture and implementation of the project.

📍 Understanding of Databricks architecture and its integration with Azure services.

📍 Performance optimization techniques in Databricks.

📍 Fetching Data from Azure Data Lake and Storing It in a SQL Database

📍 Write a PySpark script to read data from Azure Data Lake Storage (ADLS) and write it to a SQL database.
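A sketch of what such a script could look like, assuming a Databricks/Spark environment with the SQL Server JDBC driver available. Every account name, path, and credential below is a placeholder, not a value from the interview:

```python
# Hypothetical ADLS-to-SQL pipeline sketch; all names/credentials are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("adls-to-sql").getOrCreate()

# Authenticate to ADLS Gen2 with an account key (a service principal or
# managed identity is preferable in production).
spark.conf.set(
    "fs.azure.account.key.<storage_account>.dfs.core.windows.net",
    "<account_key>",
)

# Read the source data from ADLS Gen2 (abfss:// is the Gen2 URI scheme).
df = spark.read.format("parquet").load(
    "abfss://<container>@<storage_account>.dfs.core.windows.net/path/to/data"
)

# Write the DataFrame to Azure SQL Database over JDBC.
(df.write
   .format("jdbc")
   .option("url", "jdbc:sqlserver://<server>.database.windows.net:1433;database=<db>")
   .option("dbtable", "dbo.target_table")
   .option("user", "<user>")
   .option("password", "<password>")
   .mode("append")
   .save())
```

In an interview, mentioning secret management (Key Vault-backed secret scopes rather than inline credentials) is usually worth a sentence.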

📍 Explanation of Inner Join, Left Join, Right Join, Full Outer Join, Cross Join, and Semi Join.

📍 Demonstrate how Inner Join and Left Join behave with sample data.

📍 Write a PySpark script to perform joins between two DataFrames.

📍 Write a PySpark script to drop specific columns from a DataFrame.

📍 Write a PySpark script to filter specific rows based on conditions.

Round 2: Techno-Managerial Interview (45 Minutes)

📍 Deep dive into the design choices, scalability, and optimizations of the candidate's project.

📍 Discussion on real-world challenges faced and how they were tackled.

📍 Handling failed or late records using Azure Stream Analytics.

📍 Best practices for checkpointing, event retention, and data reprocessing.

📍 Top Product Location-wise (Window Function Problem)

📍 Write a PySpark script to find the top product for each location using Window Functions.

📍 Write a PySpark script to identify duplicate records in a DataFrame.

📍 Overview of Spark Architecture (Driver, Executors, Cluster Manager).

📍 Internal Working of Spark (DAG, Stages, Tasks, and Execution Plan).

📍 Types of activities in ADF (Data Flow, Copy, Lookup, Web, ForEach, etc.).

Round 3: HR Discussion

📍 Salary Negotiation & Expectations

📍 Company Culture & Benefits Discussion

📍 Career Growth & Future Opportunities